Some Current Quantitative Problems in Corpus Linguistics and a Sketch of Some Solutions

Author

  • Stefan Th. Gries
Abstract

This paper surveys a variety of methodological problems in current quantitative corpus linguistics. Some of the problems discussed concern corpus linguistics in general, such as the impact that dispersion, type frequencies/entropies, and directionality (should) have on the computation of association measures, as well as the impact that neglecting the sampling structure of a corpus can have on a statistical analysis. Others involve more specialized areas in which corpus-linguistic work is currently booming, such as historical linguistics and learner corpus research. For each problem, initial ideas and pointers as to how it can be resolved are provided and exemplified in some detail.

Similar resources

Critical Discourse Analysis of Hedges and Boosters in Iranian TV Election Debates of Presidential Candidates

To win the attention of the audience, presidential candidates rely on their own rhetorical methods. Hedges and boosters as metadiscourse markers have been the focus of many studies as the communicative strategies enabling speakers to soften the force of utterances or moderate their assertive force. TV news was used as the corpus of this study, whereas most of the previous studies have focused o...

ACADEMIC WRITING REVISITED: A PHRASEOLOGICAL ANALYSIS OF APPLIED LINGUISTICS HIGH-STAKE GENRES FROM THE PERSPECTIVE OF LEXICAL BUNDLES

Lexical bundles are frequent word combinations that commonly appear in different registers. They have been the subject of much research in the area of corpus linguistics during the last decade. While most previous studies of bundles have mainly focused on variations in the use of these word combinations across different registers and a number of disciplines, not much research has been done to e...

Linguateca's infrastructure for Portuguese and how it allows the detailed study of language varieties

In this paper I present briefly Linguateca, an infrastructure project for Portuguese which is ten years old, and will show how it provides several possibilities to study grammatical and semantical differences between varieties of the language. After a short history of Portuguese corpus linguistics, presenting the main projects in the area, I discuss in some detail the AC/DC project (Santos & Bi...

Research Article Introductions: Sub-disciplinary Variations in Applied Linguistics

The present study aimed to investigate the generic organization of research article introductions in local Iranian and international journals in English for Specific Purposes, English for General Purposes, and Discourse Analysis. Overall, 120 published articles were selected from the established journals representing the above subdisciplines. Each subdiscipline was represented by 20 local and 2...

A Genre Analysis of the Introduction Section of Applied Linguistics and Chemistry Research Articles

This study investigated the cross-disciplinary variations in the generic structure of Introduction sections of 52 Applied Linguistics and 52 Chemistry research articles drawing upon Swales’ (2004) framework, taking into account the new insights proposed by Bhatia (2004), Shehzad (2008), and Lim (2012, 2014). To this end, in addition to collecting quantitative data and conducting frequency and C...

Publication date: 2015